A General Internal Regret-Free Strategy

Authors

  • Ehud Lehrer
  • Eilon Solan
Abstract

We study sequential decision problems where the decision maker does not observe the states of nature, but rather receives a noisy signal, whose distribution depends on the current state and on the action that she plays. We do not assume that the decision maker considers the worst-case scenario, but rather has a response correspondence, which maps distributions over signals to subjective best responses. We extend the concept of internal regret-free strategy to this setup and provide an algorithm that generates such a strategy.

Journal of Economic Literature classification numbers: C61, C72, D81, D82, D83.

Similar resources

Calibration and Internal no-Regret with Partial Monitoring

Calibrated strategies can be obtained by performing strategies that have no internal regret in some auxiliary game. Such strategies can be constructed explicitly with the use of Blackwell's approachability theorem, in another auxiliary game. We establish the converse: a strategy that approaches a convex B-set can be derived from the construction of a calibrated strategy. We develop these tool...

Set-valued approachability and online learning with partial monitoring

Approachability has become a standard tool in analyzing learning algorithms in the adversarial online learning setup. We develop a variant of approachability for games where there is ambiguity in the obtained reward: it belongs to a set rather than being a single vector. Using this variant we tackle the problem of approachability in games with partial monitoring and develop a simple and general...

Online Learning: Sufficient Statistics and the Burkholder Method

We uncover a fairly general principle in online learning: If regret can be (approximately) expressed as a function of certain “sufficient statistics” for the data sequence, then there exists a special Burkholder function that 1) can be used algorithmically to achieve the regret bound and 2) only depends on these sufficient statistics, not the entire data sequence, so that the online strategy is...

From External to Internal Regret

External regret compares the performance of an online algorithm, selecting among N actions, to the performance of the best of those actions in hindsight. Internal regret compares the loss of an online algorithm to the loss of a modified online algorithm, which consistently replaces one action by another. In this paper we give a simple generic reduction that, given an algorithm for the external ...
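
The two notions contrasted in this abstract can be illustrated with a minimal Python sketch. It assumes the standard full-information definitions (the loss of every action is observed each round) and is not the reduction described in the paper; the function names and the data are hypothetical, chosen only for illustration.

```python
from itertools import permutations


def external_regret(actions, losses):
    """Algorithm's total loss minus the loss of the best single action in hindsight."""
    n_actions = len(losses[0])
    incurred = sum(loss[a] for a, loss in zip(actions, losses))
    best_fixed = min(sum(loss[k] for loss in losses) for k in range(n_actions))
    return incurred - best_fixed


def internal_regret(actions, losses):
    """Largest gain from consistently replacing one action i by another action j."""
    n_actions = len(losses[0])
    best_gain = 0.0
    for i, j in permutations(range(n_actions), 2):
        # Only rounds in which action i was actually played are modified.
        gain = sum(loss[i] - loss[j] for a, loss in zip(actions, losses) if a == i)
        best_gain = max(best_gain, gain)
    return best_gain


# Hypothetical play: action 0 in the first two rounds, action 1 in the last two.
actions = [0, 0, 1, 1]
# Hypothetical losses: losses[t][k] is the loss of action k in round t.
losses = [
    [1.0, 2.0, 0.0],
    [1.0, 2.0, 0.0],
    [2.0, 0.0, 2.0],
    [2.0, 0.0, 2.0],
]

print(external_regret(actions, losses))  # -2.0
print(internal_regret(actions, losses))  # 2.0
```

On this data the external regret is negative, because no fixed action beats the sequence that was played, yet the internal regret is positive, because swapping action 0 for action 2 on the rounds where 0 was played would have lowered the loss.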

Online Learning with Transductive Regret

We study online learning with the general notion of transductive regret, that is regret with modification rules applying to expert sequences (as opposed to single experts) that are representable by weighted finite-state transducers. We show how transductive regret generalizes existing notions of regret, including: (1) external regret; (2) internal regret; (3) swap regret; and (4) conditional sw...

Journal:
  • Dynamic Games and Applications

Volume 6

Publication date: 2016